PACKAGING CELL LINES FOR HIV-DERIVED RETROVIRAL VECTOR PARTICLES

BACKGROUND OF THE INVENTION 

Retroviral vectors based on lentiviruses, such as human immunodeficiency
viruses (HIV), can infect nondividing cells, and integration of proviral DNA occurs
without the need for cell division. These properties make lentiviruses attractive for gene
transfer into nondividing cells, such as hepatocytes, myofibers, hematopoietic stem
cells, and neurons.

However, the use of lentivirus vectors, particularly HIV vectors, for gene therapy
is hampered by concern over their safety. Thus, a need exists for the development of
lentivirus vectors, particularly HIV vectors, with improved safety, particularly for
gene therapy.

SUMMARY OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
preferred embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

In one embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., a mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins.

In a second embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., a mammalian cell); (b) a first retroviral nucleotide sequence in
the cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a second retroviral nucleotide sequence in the cell
which comprises the coding sequence for a heterologous envelope protein.

In a third embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., a mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and (d) a third retroviral
nucleotide sequence which comprises a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and integration.

In a fourth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., a mammalian cell); (b) a retroviral nucleotide sequence in the
cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration.

In a fifth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., a mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins.

In a sixth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., a mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a second retroviral nucleotide sequence in the cell which comprises the coding
sequence for a heterologous envelope protein.

In a seventh embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., a mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; (c)
a second retroviral nucleotide sequence in the cell which comprises the coding sequence
for a heterologous envelope protein; and (d) a third retroviral nucleotide sequence which
comprises a DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

In an eighth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., a mammalian cell); (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a retroviral nucleotide sequence which comprises a DNA sequence of interest and
HIV cis-acting sequences required for packaging, reverse transcription and integration.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a retroviral nucleotide sequence which comprises a codon optimized gag
coding sequence and (2) a retroviral nucleotide sequence which comprises a codon
optimized pol coding sequence, in place of the retroviral nucleotide sequence which
comprises a codon optimized gagpol coding sequence.

In a particular embodiment, the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G). In another embodiment, the
heterologous envelope protein is the amphotropic envelope of the Moloney leukemia
virus (MLV).

Cell lines for producing viral accessory protein independent lentivirus-derived
retroviral vector particles are produced by transfecting host cells (e.g., mammalian host
cells) with a plasmid comprising a DNA sequence which encodes lentivirus gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the lentivirus gagpol proteins. Depending upon the particular
cell line being produced, the host cells are also co-transfected with a plasmid
comprising a DNA sequence which encodes a heterologous envelope protein, or a
plasmid comprising a DNA sequence of interest and lentivirus cis-acting sequences
required for packaging, reverse transcription and integration, or both of these plasmids.
Alternatively, host cells are transfected with a plasmid comprising a codon optimized
DNA sequence encoding a lentivirus gag protein and a plasmid comprising a codon
optimized DNA sequence encoding a lentivirus pol protein, in place of the plasmid
comprising a codon optimized DNA sequence encoding both lentivirus gagpol proteins.

Cell lines for producing viral accessory protein independent HIV-derived
retroviral vector particles are produced by co-transfecting host cells (e.g., mammalian
host cells) with a plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the HIV gagpol proteins. Depending upon the particular cell line
being produced, the host cells are also co-transfected with a plasmid comprising a DNA
sequence which encodes a heterologous envelope protein, or a plasmid comprising a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both of these plasmids. Alternatively, host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding an
HIV gag protein and a plasmid comprising a codon optimized DNA sequence encoding
an HIV pol protein, in place of the plasmid comprising a codon optimized DNA sequence
encoding both HIV gagpol proteins.

The present invention also relates to methods of producing viral accessory
protein independent lentivirus-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA sequence
has been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and lentivirus cis-acting sequences required for packaging, reverse transcription
and integration. Alternatively, host cells are transfected with a plasmid comprising a
codon optimized DNA sequence encoding a lentivirus gag protein and a plasmid
comprising a codon optimized DNA sequence encoding a lentivirus pol protein, in place
of the first plasmid comprising a codon optimized DNA sequence encoding both
lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes HIV gagpol proteins, wherein said DNA sequence has
been codon optimized by mutagenesis to improve expression of the HIV gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, host cells are transfected with a plasmid comprising a codon
optimized DNA sequence encoding an HIV gag protein and a plasmid comprising a
codon optimized DNA sequence encoding an HIV pol protein, in place of the first
plasmid comprising a codon optimized DNA sequence encoding both HIV gagpol
proteins.

The present invention also relates to viral accessory protein independent
retroviral particles produced by or obtainable by (obtained by) the methods described
herein.

The present invention further relates to isolated DNA encoding a codon
optimized lentivirus gagpol, isolated DNA encoding the gag coding region of a codon
optimized lentivirus gagpol, and isolated DNA encoding the pol coding region of a
codon optimized lentivirus gagpol. In a particular embodiment, the present invention
relates to isolated DNA encoding a codon optimized HIV gagpol, isolated DNA
encoding the gag coding region of a codon optimized HIV gagpol, and isolated DNA
encoding the pol coding region of a codon optimized HIV gagpol.

The packaging cell lines and viral particles of the present invention can be used
for gene therapy or gene replacement with improved safety. The packaging cell lines
and viral particles of the present invention can also be used in development and
production of vaccines, and in production of biochemical reagents. Gene therapy
vectors produced with the cell lines of the present invention are expected to be valuable
medical therapeutics.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of an expression cassette containing the codon
optimized gagpol genes. The DNA was constructed in multiple segments, which are
indicated at the top as 1/3, 2/3, 3/3 (A, B, C and D) and HIN. Restriction sites used to
assemble the cloned segments are indicated above the kilobasepair (Kb) ruler. Below
the ruler are multiple features showing the location of the human cytomegalovirus
(CMV) promoter, human betaglobin sequences (Bglobin), mRNA sequences (thinner
line represents intronic sequence), the gag and pol open reading frames, the individual
proteolytic fragment coding sequences (p17_MA, p24_CA, p7, p6, PR, p51_RT,
RNaseH and integrase (IN)) and each synthetic oligonucleotide used in the assembly
process (multiple adjacent open arrows).

Figure 2 is a table which depicts codon usage frequencies in genes which are
highly expressed and in the codon optimized gagpol open reading frame of the HIV
packaging construct described herein.

Figure 3 is a schematic representation of the HIV provirus and a three-plasmid
expression system used for generating a pseudotyped HIV-based vector by transient
transfection as described in Naldini et al., Science, 272:263-267 (1996).

Figure 4 is a list of some characteristics relating to the HIV Rev protein. 

Figure 5 is a list of some points relating to codon optimization of HIV gagpol.

Figure 6 is a partial DNA sequence of HIV gag (SEQ ID NO:1), showing
inactivation of inhibitory sequences as described in Schwartz, S. et al., J. Virol.,
66(12):7176-7182 (1992).

Figure 7 is a plot of the %(G+C) content of wildtype HIV gagpol sequences and
theoretically codon optimized HIV gagpol sequences. The percent of bases, either G or
C, was calculated for a 30 nucleotide moving window for the entire length of the gagpol
gene, and the value plotted versus nucleotide position. Diamonds = HIV gagpol
sequences; squares = full optimal back-translation for gag open reading frame;
triangles = full optimal back-translation for pol open reading frame; CO = codon
optimized.
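The moving-window calculation described for Figure 7 can be sketched as follows. This is a minimal illustration, not the procedure actually used to draw the figure: the function name `gc_percent_windows` is hypothetical, and reporting each window by the 1-based position of its first nucleotide is an assumption (the figure may plot the window center instead).

```python
def gc_percent_windows(seq, window=30):
    """Return (position, %G+C) for every full window along seq.

    position is the 1-based index of the first nucleotide in the window
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [base in "GC" for base in seq]
    count = sum(is_gc[:window])  # G+C count in the first window
    points = [(1, 100.0 * count / window)]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the base entering, drop the base leaving.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        points.append((start + 1, 100.0 * count / window))
    return points


if __name__ == "__main__":
    # Toy input: a G/C-only stretch followed by an A/T-only stretch.
    toy = "GC" * 15 + "AT" * 15
    for position, percent in gc_percent_windows(toy)[::10]:
        print(position, percent)
```

Plotting the returned values against position for a wildtype sequence and for its codon optimized counterpart would give curves of the kind the figure describes.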

Figures 8A-8E depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the gag coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates the nucleotide sequence (SEQ
ID NO:2) and predicted amino acid sequence (SEQ ID NO:3) for the gag coding region
of a wildtype HIV gagpol. "pHDMHgpm2.seq" indicates the nucleotide sequence (SEQ
ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) for the gag coding region
of a codon optimized HIV gagpol. The "NL4-3 genbank.SEQ" sequences are publicly
available at the NIH GenBank sequence repository (Accession No. M19921).

Figures 9A-9L depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the pol coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates a nucleotide sequence (SEQ
ID NO:6) and a predicted amino acid sequence (SEQ ID NO:7) for the pol coding
region of a wildtype HIV gagpol available in the NIH GenBank sequence repository
(Accession No. M19921). The nucleotide and amino acid sequences for the pol coding
region available in the GenBank sequence repository contain two sequence errors,
which are indicated in Figures 9A-9L with shading. "pNL4-3" indicates the correct
nucleotide sequence (SEQ ID NO:8) and predicted amino acid sequence (SEQ ID
NO:9) for the pol coding region of a wildtype HIV gagpol. "pHDMHgpm2.seq"
indicates the nucleotide sequence (SEQ ID NO:10) and predicted amino acid sequence
(SEQ ID NO:11) for the pol coding region of a codon optimized HIV gagpol.

Figures 10A-10D depict the DNA sequence (SEQ ID NO:12) for pHDMHgpm2.
The CMV enhancer/promoter is at nucleotides 97 to 679, human betaglobin sequences
(Bglobin) are at nucleotides 761 to 864, 865 to 1303 and 5710 to 6469 (end of Bglobin
is at nucleotides 6445 to 6469), mRNA sequences are at nucleotides 680 to 778 and 1255
to 5921, SV40 origin of replication is at nucleotides 8796 to 8908, beta-lactamase (bla)
coding region is at nucleotides 6709 to 7569, intron sequences are at nucleotides 779 to
1254, the codon optimized gag coding region is at nucleotides 1318 to 2820, the codon
optimized pol coding region is at nucleotides 2619 to 5624 and the poly A site is at
nucleotides 5897 to 5921.

Figure 11 is a circular map of plasmid pHDMHgpm2.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
particular embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

The cell lines are engineered to express the lentivirus proteins necessary for
virus particle formation (gagpol proteins), without containing DNA sequences from
lentivirus accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev response
element (RRE)). Additionally, no viral sequences (such as cis-acting elements termed
constitutive transport elements (CTEs)) will be expressed as RNA of any kind. DNA
sequences for lentivirus gagpol are codon optimized by extensively mutagenizing the
sequences to improve expression and to reduce the risk of recombination between
transfer vector sequences and gagpol messenger RNA. This greatly improves the safety
of virus preparations generated from these cell lines. In a particular embodiment, the
DNA sequences for lentivirus gagpol are not codon optimized in the overlap region
between the gag and pol sequences and in cis-acting signals necessary for translation of
pol.

Examples of lentiviruses include human immunodeficiency viruses (e.g., HIV-1,
HIV-2, HIV-3), bovine lentiviruses (e.g., bovine immunodeficiency viruses, bovine
immunodeficiency-like viruses, Jembrana disease viruses), equine lentiviruses (e.g.,
equine infectious anemia viruses), feline lentiviruses (e.g., feline immunodeficiency
viruses, panther lentiviruses, puma lentiviruses), ovine/caprine lentiviruses (e.g.,
Brazilian caprine lentiviruses, caprine arthritis-encephalitis viruses, Maedi-Visna
viruses, Maedi-Visna-like viruses, Maedi-Visna-related viruses, ovine lentiviruses,
Visna lentiviruses), simian AIDS retroviruses (e.g., human T-cell lymphotropic virus
type 4), simian immunodeficiency viruses, simian-human immunodeficiency viruses,
human lymphotrophic viruses (e.g., type III), simian T-cell lymphotrophic viruses.

In another embodiment, cell lines are engineered to express the HIV proteins
necessary for virus particle formation (gagpol proteins), without containing DNA
sequences from HIV accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev
response element (RRE)). Additionally, no viral sequences (such as cis-acting elements
termed constitutive transport elements (CTEs)) will be expressed as RNA of any kind.
DNA sequences for HIV gagpol are codon optimized by mutagenesis to improve
expression and to reduce the risk of recombination between transfer vector sequences
and gagpol messenger RNA. In a particular embodiment, the DNA sequences for HIV
gagpol are not codon optimized in the overlap region between the gag and pol
sequences and in cis-acting signals necessary for translation of pol.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a nucleotide sequence which comprises a codon optimized gag coding
sequence and (2) a nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the nucleotide sequence which comprises a codon optimized
gagpol coding sequence. In this embodiment, the gag and pol coding sequences can be
completely codon optimized.

Benefits of the present invention include the removal of potentially harmful
lentivirus accessory proteins and other viral sequences, and the reduction of the risk of
recombination to produce replication competent virus.

Packaging cell lines for producing viral accessory protein independent
lentivirus-derived retroviral vector particles comprise a mammalian cell and a retroviral
nucleotide sequence comprising a coding sequence for a lentivirus gagpol which has
been codon optimized. In a particular embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein. In a second embodiment, the packaging cell lines
further comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein and a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration. In a third embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence which comprises a DNA sequence of interest
and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, the packaging cell lines of the present invention comprise (1) a
retroviral nucleotide sequence which comprises a codon optimized gag coding sequence
and (2) a retroviral nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the retroviral nucleotide sequence which comprises a codon
optimized gagpol coding sequence.

The coding sequence(s) for lentivirus gagpol which has (have) been codon
optimized results in improved expression of the lentivirus gagpol proteins and reduces
the risk of recombination between the transfer vector and gagpol messenger RNA.
Codon optimization of the coding sequence(s) for lentivirus gagpol was obtained by
mutagenizing, for each particular amino acid residue, specific nucleic acid bases in a
codon for the particular amino acid residue to a nucleic acid base which is present in a
codon which occurs at a high frequency in genes which are highly expressed for the
same amino acid residue. In a particular embodiment, the resulting optimized codon
also does not cause introduction of mRNA splicing signals into the codon optimized
sequence. Thus, in a particular embodiment, codon optimization of the coding
sequence(s) for lentivirus gagpol is obtained by mutagenizing, for each particular amino
acid residue, specific nucleic acid bases in a codon for the particular amino acid residue
to a nucleic acid base that is present in a codon which (1) occurs at a high frequency in
genes which are highly expressed for the same amino acid residue and (2) does not
cause introduction of mRNA splicing signals into the codon optimized sequence.

Codon optimization typically results in the removal of nucleic acid base A-rich
instability elements.
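The codon selection rule described above (choose, for each amino acid, a codon that is frequent in highly expressed genes, and reject a choice that would introduce an unwanted signal) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure used to build the constructs described herein: the function names are hypothetical, the usage weights in the demo are made up rather than taken from Figure 2, and the motif `GGTAAG` merely stands in for whatever splice-signal patterns one would actually screen for.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def ranked_synonyms(usage):
    """Map each amino acid to its codons, most frequently used first."""
    ranked = {}
    for codon, aa in CODON_TABLE.items():
        ranked.setdefault(aa, []).append(codon)
    for codons in ranked.values():
        codons.sort(key=lambda c: usage[c], reverse=True)
    return ranked


def codon_optimize(seq, usage, forbidden=()):
    """Swap each codon for the most used synonym that adds no forbidden motif."""
    ranked = ranked_synonyms(usage)
    out = ""
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        for codon in ranked[aa]:
            candidate = out + codon
            # A motif newly created by this codon must end in the last 3 bases.
            if not any(m in candidate[-(len(m) + 2):] for m in forbidden):
                break
        out += codon  # if every synonym was rejected, the last-ranked one is used
    return out


if __name__ == "__main__":
    # Made-up weights for illustration only: favour codons ending in C, then G.
    weight = {"C": 4, "G": 3, "T": 2, "A": 1}
    toy_usage = {codon: weight[codon[2]] for codon in CODON_TABLE}
    original = "ATGGGTGCGAGAGCGTCAGTATTAAGC"  # short arbitrary test sequence
    optimized = codon_optimize(original, toy_usage, forbidden=("GGTAAG",))
    print(optimized, translate(optimized) == translate(original))
```

Because every substitution is synonymous, the encoded protein is unchanged while the nucleotide sequence diverges from the input. A real workflow would also need a way to leave selected regions untouched, which this sketch omits.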

In a particular embodiment, the coding sequence for an HIV gagpol (pNL4-3;
available through the AIDS repository, NIH; Adachi et al., J. Virol., 59:284-291 (1986))
has been codon optimized to improve translational efficiency of the HIV gagpol
proteins and reduce the risk of recombination between the transfer vector and HIV
gagpol messenger RNA. Two hundred thirty-seven base pairs (237 bp) consisting of the
gag pol overlap and cis-acting signals necessary for translation of pol (nucleotides 2583
to 2819 of SEQ ID NO:12) were not optimized. The HIV gagpol sequence obtained
using the codon optimization process does not differ at the amino acid level from the
wildtype HIV gagpol sequence, but differs at the nucleotide level from the wildtype
HIV gagpol sequence. A codon optimized HIV gag sequence is shown in Figures 8A-8E
(pHDMHgpm2.seq) (SEQ ID NO:4). A codon optimized HIV pol sequence is shown in
Figures 9A-9L (pHDMHgpm2.seq) (SEQ ID NO:10).
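The nucleotide coordinates quoted for SEQ ID NO:12 can be checked with simple interval arithmetic. The sketch below takes the gag and pol coding region coordinates from the description of Figures 10A-10D and the unoptimized region from the paragraph above, exactly as printed, and treats them as 1-based and inclusive (an assumption); the dictionary keys and function names are hypothetical.

```python
# Coordinates as printed in the text for SEQ ID NO:12 (assumed 1-based, inclusive).
FEATURES = {
    "gag_cds": (1318, 2820),
    "pol_cds": (2619, 5624),
    "not_optimized": (2583, 2819),
}


def span_length(span):
    """Number of nucleotides covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Return the inclusive span shared by a and b, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def frame_offset(a, b):
    """Offset (0, 1 or 2) of the start of b relative to the reading frame of a."""
    return (b[0] - a[0]) % 3


if __name__ == "__main__":
    gag, pol = FEATURES["gag_cds"], FEATURES["pol_cds"]
    print("gag length:", span_length(gag))
    print("pol length:", span_length(pol))
    print("unoptimized length:", span_length(FEATURES["not_optimized"]))
    print("gag/pol overlap:", overlap(gag, pol))
    print("pol frame offset vs gag:", frame_offset(gag, pol))
```

With these numbers the unoptimized span is 237 nucleotides, matching the stated 237 bp. The gag and pol coding regions are 1503 and 3006 nucleotides long (both whole numbers of codons), they share 202 nucleotides, and pol begins at an offset of 2 modulo 3 relative to gag, that is, in a different reading frame.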

A plasmid comprising DNA sequences which encode codon optimized lentivirus
gagpol proteins is also referred to herein as a packaging construct. This plasmid
includes a promoter which drives the expression of the gagpol proteins, such as the
human cytomegalovirus (hCMV) immediate early promoter. This plasmid is defective
for the production of the viral envelope and accessory proteins tat, vif, vpr, vpu, nef and
rev and the Rev response element (RRE). The packaging construct also does not
contain viral sequences which are transcribed into mRNA, such as constitutive transport
elements (CTEs).

A packaging construct comprising a codon optimized HIV gagpol is depicted in
Figure 1 and in Figure 11. Figures 10A-10D depict the DNA sequence (SEQ ID
NO:12) for the packaging construct pHDMHgpm2. This packaging construct
(pHDMHgpm2) was constructed as follows: Plasmid pMDA.HIVgp mam was
generated by chemical synthesis and PCR assembly (which is described in, for example,
Stemmer et al., Gene, 164:49-53 (1995)) of 215 different oligonucleotides. The DNA
sequence for pMDA.HIVgp mam is the same as the DNA sequence for
pMDA.HIVgp jtg except for 4.3 kb which was codon optimized using the DNAStar
program (LaserGene, Madison, WI). Two hundred thirty-seven base pairs (237 bp)
consisting of the gag pol overlap and cis-acting signals necessary for translation of pol
(nucleotides 2583 to 2819 of SEQ ID NO:12) were not optimized due to dual reading


WO 00/15819 PCT/US99/20675


frame constraints. A Ns3 site 5' of IN was preserved to aid fusion with wildtype
sequences. Several single or double base pair silent mutations were introduced either to
prevent potential splice donors and acceptors, or by the synthesis process.
pMDA.HIVgp jtg was derived from HIV-1 strain NL4-3. The protease mutation that is
present in the NL4-3 NIH GenBank sequence was then repaired (Figure 9B), changing
the nucleotide present at position 2948 of SEQ ID NO:12 from a "G" to a "C", thereby
producing the codon present at nucleotide positions 2948 to 2950 of SEQ ID NO:12
which encodes an arginine instead of the glycine present in the NL4-3 GenBank amino
acid sequence. The resulting plasmid was named pMDHgpmam. The EcoRI-HindIII
fragment of pMDHgpmam was inserted into pHDM2b, a high copy version of the pMD
vector (Ory, D. et al., Proc. Natl. Acad. Sci. USA, 93(21):11400-11406 (1996)), to
produce plasmid pHDMHgpm. The sequencing mutation that is present in the RNase
domain of the NL4-3 NIH GenBank sequence was repaired (Figure 9H), changing the
codon present at nucleotide positions 4724 to 4726 of SEQ ID NO:12 from "GGG" to
"AAG", thereby producing a codon encoding a lysine instead of the glycine present in
the NL4-3 GenBank amino acid sequence. The resulting plasmid was named
pHDMHgpm2. Codon usage frequencies in the codon optimized gagpol open reading
frame of the packaging construct pHDMHgpm2 are shown in Figure 2.
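The two codon repairs described above can be checked against the standard genetic code. The sketch below is only a consistency check, not part of the construction protocol. The text gives the full codons only for the second repair, so for the first repair the sketch assumes that the original codon is a glycine codon of the form GGN and that position 2948 is its first base; the helper name `substitute` is hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon, offset, base):
    """Return codon with the base at the given 0-based offset replaced."""
    return codon[:offset] + base + codon[offset + 1:]


if __name__ == "__main__":
    # Repair 1 (assumed form): first base of a glycine codon GGN changed G -> C.
    for third in BASES:
        before = "GG" + third
        after = substitute(before, 0, "C")
        print(before, CODON_TABLE[before], "->", after, CODON_TABLE[after])

    # Repair 2 (codons as given in the text): GGG changed to AAG.
    print("GGG", CODON_TABLE["GGG"], "->", "AAG", CODON_TABLE["AAG"])
```

Whatever the third base, changing the first G of a GGN codon to C turns glycine into arginine, and GGG to AAG turns glycine into lysine, in agreement with the amino acid changes stated in the text.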

As used herein, a heterologous envelope protein permits pseudotyping of
particles generated by the packaging construct and includes the G glycoprotein of
vesicular stomatitis virus (VSV G) and the amphotropic envelope of the Moloney
leukemia virus (MLV). A plasmid comprising a DNA sequence which encodes a
heterologous envelope protein is also referred to herein as an envelope coding plasmid.

The terms "mammal" and "mammalian", as used herein, refer to any vertebrate
animal, including monotremes, marsupials and placental, that suckle their young and
either give birth to living young (eutherian or placental mammals) or are egg-laying
(metatherian or nonplacental mammals). Examples of mammalian species include
humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice,
guinea pigs) and ruminants (e.g., cows, pigs, horses).

Examples of mammalian cells include human (such as HeLa cells, 293T cells,
NIH 3T3 cells), bovine, ovine, porcine, murine (such as embryonic stem cells), rabbit,
and monkey (such as COS1 cells) cells. The cell may be a non-dividing cell (including
hepatocytes, myofibers, hematopoietic stem cells, neurons) or a dividing cell. The cell
may be an embryonic cell, bone marrow stem cell or other progenitor cell. Where the
cell is a somatic cell, the cell can be, for example, an epithelial cell, fibroblast, smooth
muscle cell, blood cell (including a hematopoietic cell, red blood cell, T-cell, B-cell,
etc.), tumor cell, cardiac muscle cell, macrophage, dendritic cell, neuronal cell (e.g., a
glial cell or astrocyte), or pathogen-infected cell (e.g., those infected by bacteria,
viruses, virusoids, parasites, or prions).

Typically, cells isolated from a specific tissue (such as epithelium, fibroblast or
hematopoietic cells) are categorized as a "cell-type." The cells can be obtained
commercially or from a depository or obtained directly from an animal, such as by
biopsy. Alternatively, the cell need not be isolated at all from the animal where, for
example, it is desirable to deliver the virus to the animal in gene therapy.

To produce the cell lines of the present invention for producing viral accessory
protein independent lentivirus-derived retroviral vector particles, mammalian host cells
are co-transfected with (a) a first plasmid comprising a DNA sequence which encodes
lentivirus gagpol proteins, wherein said DNA sequence has been codon optimized by
mutagenesis, as described above, to improve expression of the lentivirus gagpol
proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration, or both, under conditions appropriate for
transfection of the cells.

In a particular embodiment, to produce the cell lines of the present invention for
producing viral accessory protein independent HIV-derived retroviral vector particles,
mammalian host cells are co-transfected with (a) a first plasmid comprising a DNA
sequence which encodes HIV gagpol proteins, wherein said DNA sequence has been
codon optimized by mutagenesis, as described above, to improve expression of the HIV
gagpol proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both, under conditions appropriate for transfection of
the cells.

Virus stocks consisting of viral accessory protein independent lentivirus-derived,
particularly HIV-derived, retroviral vector particles of the present invention are
produced by maintaining the transfected cells under conditions suitable for virus
production (e.g., in an appropriate growth media and for an appropriate period of time).
Such conditions, which are not critical to the invention, are generally known in the art.
See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor University Press, New York (1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York (1998); U.S. Patent
No. 5,449,614; and U.S. Patent No. 5,460,959; the teachings of which are incorporated
herein by reference.

To generate viral accessory protein independent lentivirus-derived retroviral 
vector particles, mammalian host cells can be co-transfected with (a) a first plasmid 
comprising a DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA 
sequence has been codon optimized by mutagenesis, as described above, to improve 
expression of the lentivirus gagpol proteins; (b) a second plasmid comprising a DNA 
sequence which encodes a heterologous envelope protein; and (c) a third plasmid 
comprising a DNA sequence of interest and lentivirus cis-acting sequences required for 
packaging, reverse transcription and integration. Alternatively, mammalian host cells are 
transfected with a plasmid comprising a codon optimized DNA sequence encoding a 
lentivirus gag protein and a plasmid comprising a codon optimized DNA sequence 
encoding a lentivirus pol protein, in place of the first plasmid comprising a codon 
optimized DNA sequence encoding both lentivirus gagpol proteins. 

In a particular embodiment, the invention relates to methods of producing viral 
accessory protein independent HIV-derived retroviral vector particles, comprising 
co-transfecting mammalian host cells with (a) a first plasmid comprising a DNA sequence 
which encodes HIV gagpol proteins, wherein said DNA sequence has been codon 
optimized by mutagenesis, as described above, to improve expression of the HIV gagpol 
proteins; (b) a second plasmid containing a DNA sequence which encodes a 
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of 
interest and HIV cis-acting sequences required for packaging, reverse transcription and 
integration. Alternatively, mammalian host cells are transfected with a plasmid 
comprising a codon optimized DNA sequence encoding an HIV gag protein and a 
plasmid comprising a codon optimized DNA sequence encoding an HIV pol protein, in 
place of the first plasmid comprising a codon optimized DNA sequence encoding both 
HIV gagpol proteins. 

Virus particles produced by the methods described herein, using a codon 
optimized HIV packaging construct produced as described herein, were compared by 
Western analysis with virus particles produced as described in Naldini et al., Science, 
272:263-267 (1996), using the packaging construct plasmid pCMVΔR8. Both the 
immunological reactivity and the proteolytic processing were confirmed to be 
indistinguishable. 

A plasmid comprising a DNA sequence of interest and HIV cis-acting sequences 
required for packaging, reverse transcription and integration is also referred to herein as 
a transfer vector. A transfer vector, as used herein, refers to a vehicle which is used to 
introduce a DNA of interest into a eukaryotic cell, particularly a mammalian cell. 
Figure 3 depicts an example of a transfer vector. 

DNA sequence of interest, as used herein, includes all or a portion of a gene or 
genes encoding a nucleic acid product whose expression in a cell or a mammal is 
desired. In a particular embodiment, the nucleic acid product is a heterologous 
therapeutic protein. Examples of therapeutic proteins include antigens or immunogens, 
such as a polyvalent vaccine, cytokines, tumor necrosis factor, interferons, interleukins, 
adenosine deaminase, insulin, T-cell receptors, soluble CD4, growth factors (such as 
epidermal growth factor, human growth factor, insulin-like growth factors, fibroblast 
growth factors), blood factors, such as Factor VIII, Factor IX, cytochrome b, 
glucocerebrosidase, ApoE, ApoC, ApoAI, the LDL receptor, negative selection markers 
or "suicide proteins", such as thymidine kinase (including the HSV, CMV, VZV TK), 
anti-angiogenic factors, Fc receptors, plasminogen activators, such as t-PA, u-PA and 
streptokinase, dopamine, MHC, tumor suppressor genes such as p53 and Rb, 
monoclonal antibodies or antigen binding fragments thereof, drug resistance genes, ion 
channels, such as a calcium channel or a potassium channel, adrenergic receptors, 
hormones (including growth hormones) and anti-cancer agents. In another embodiment, 
the nucleic acid product is a gene product to be expressed in a cell or a mammal and 
which product is otherwise defective or absent in the cell or mammal. For example, the 
nucleic acid product can be a functional gene(s) which is defective or absent in the cell 
or mammal. 

DNA sequence of interest includes DNA sequences (control sequences) which 
are necessary to drive the expression of the gene or genes. The control sequences are 
operably linked to the gene. The term "operably linked", as used herein, is defined to 
mean that the gene is linked to control sequences in a manner which allows expression 
of the gene (or the nucleic acid sequence). Generally, operably linked means 
contiguous. 

Control sequences include a transcriptional promoter, an optional operator 
sequence to control transcription, a sequence encoding suitable mRNA ribosomal 
binding sites and sequences which control termination of transcription and translation. 
In a particular embodiment, a recombinant gene encoding a desired nucleic acid product 
can be placed under the regulatory control of a promoter which can be induced or 
repressed, thereby offering a greater degree of control with respect to the level of the 
product produced. 

As used herein, the term "promoter" refers to a sequence of DNA, usually 
upstream (5') of the coding region of a structural gene, which controls the expression of 
the coding region by providing recognition and binding sites for RNA polymerase and 
other factors which may be required for initiation of transcription. Suitable promoters 
are well known in the art. Exemplary promoters include the SV40, CMV and human 
elongation factor (EF1) promoters. Other suitable promoters are readily available in the 
art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & 
Sons, Inc., New York (1998); Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd edition, Cold Spring Harbor University Press, New York (1989); and U.S. 
Patent No. 5,681,735). 

A DNA sequence of interest can be isolated from nature, modified from native 
sequences or manufactured de novo, as described in, for example, Ausubel et al., 
Current Protocols in Molecular Biology, John Wiley & Sons, New York (1998); and 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring 
Harbor University Press, New York (1989). DNA sequences can be isolated and fused 
together by methods known in the art, such as exploiting and manufacturing compatible 
cloning or restriction sites. 

The packaging cell lines and viral particles of the present invention can be used, 
in vitro, in vivo and ex vivo, to introduce DNA of interest into a eukaryotic cell (e.g., a 
mammalian cell) or a mammal (e.g., a human or other mammal or vertebrate). The cells 
can be obtained commercially or from a depository or obtained directly from a mammal, 
such as by biopsy. The cells can be obtained from a mammal to whom they will be 
returned or from another/different mammal of the same or different species. For 
example, using the packaging cell lines or viral particles of the present invention, DNA 
of interest can be introduced into nonhuman cells, such as pig cells, which are then 
introduced into a human. Alternatively, the cell need not be isolated from the mammal 
where, for example, it is desirable to deliver viral particles of the present invention to the 
mammal in gene therapy. 

Ex vivo therapy has been described, for example, in Kasid et al., Proc. Natl. 
Acad. Sci. USA, 87:473 (1990); Rosenberg et al., N. Engl. J. Med., 323:570 (1990); 
Williams et al., Nature, 310:476 (1984); Dick et al., Cell, 42:71 (1985); Keller et al., 
Nature, 318:149 (1985); and Anderson et al., United States Patent No. 5,399,346. 

Methods for administering (introducing) viral particles directly to a mammal are 
generally known to those practiced in the art. For example, modes of administration 
include parenteral, injection, mucosal, systemic, implant, intraperitoneal, oral, 
intradermal, transdermal (e.g., in slow release polymers), intramuscular, intravenous 
including infusion and/or bolus injection, subcutaneous, topical, epidural, etc. Viral 
particles of the present invention can, preferably, be administered in a pharmaceutically 
acceptable carrier, such as saline, sterile water, Ringer's solution, and isotonic sodium 
chloride solution. 

The dosage of a viral particle of the present invention administered to a 
mammal, including frequency of administration, will vary depending upon a variety of 
factors, including mode and route of administration; size, age, sex, health, body weight 
and diet of the recipient mammal; nature and extent of symptoms of the disease or 
disorder being treated; kind of concurrent treatment, frequency of treatment, and the 
effect desired. 

The teachings of all the articles, patents, patent applications and GenBank 
sequences cited herein are incorporated by reference in their entirety. 

While this invention has been particularly shown and described with references 
to preferred embodiments thereof, it will be understood by those skilled in the art that 
various changes in form and details may be made therein without departing from the 
spirit and scope of the invention as defined by the appended claims. 






WO 00/15819 



PCT/U^/20675 



CLAIMS 

What is claimed is: 

1. A packaging cell line for producing a viral accessory protein independent 
HIV-derived retroviral vector particle comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and 

d) a third retroviral nucleotide sequence in the cell which comprises a DNA 
sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

2. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

3. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

4. A packaging cell line of Claim 1 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 

5. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises a 
DNA sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

6. A packaging cell line of Claim 5 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 

7. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

8. A method of producing a packaging cell line for producing a viral accessory 
protein independent HIV-derived retroviral vector particle, comprising 
co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gag and pol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

9. A method of Claim 8 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

10. A method of Claim 8 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

11. A method of Claim 8 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

12. A method of producing a viral accessory protein independent HIV-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

13. A method of Claim 12 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

14. A method of Claim 12 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

15. A method of Claim 12 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

16. A packaging cell line for producing a viral accessory protein independent 
lentivirus-derived retroviral vector particle comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a lentivirus gagpol, wherein said coding sequence 
has been mutagenized to improve expression of the lentivirus gagpol 
proteins; 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and 

d) a third retroviral nucleotide sequence in the cell which comprises a DNA 
sequence of interest and lentivirus cis-acting sequences required for 
packaging, reverse transcription and integration. 

17. A packaging cell line of Claim 16 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

18. A packaging cell line of Claim 16 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

19. A packaging cell line of Claim 16 wherein the DNA sequence of interest 
encodes a heterologous therapeutic protein. 

20. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises a 
DNA sequence of interest and lentivirus cis-acting sequences required 
for packaging, reverse transcription and integration. 

21. A packaging cell line of Claim 20 wherein the DNA sequence of interest 
encodes a heterologous therapeutic protein. 

22. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

23. A method of producing a packaging cell line for producing a viral accessory 
protein independent lentivirus-derived retroviral vector particle, comprising 
co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gag and pol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

24. A method of Claim 23 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

25. A method of Claim 23 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

26. A method of Claim 23 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

27. A method of producing a viral accessory protein independent lentivirus-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

28. A method of Claim 27 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

29. A method of Claim 27 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

30. A method of Claim 27 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

31. A viral accessory protein independent HIV-derived retroviral vector particle 
produced by the method comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

32. A method of Claim 31 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

33. A method of Claim 31 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 



34. A method of Claim 31 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

35. A viral accessory protein independent lentivirus-derived retroviral vector 
particle produced by the method comprising co-transfecting mammalian host 
cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

36. A method of Claim 35 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

37. A method of Claim 35 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 



38. A method of Claim 35 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

39. Isolated DNA encoding a codon optimized HIV gagpol. 

40. Isolated DNA encoding a codon optimized HIV gag. 

41. Isolated DNA of Claim 40 comprising the nucleotide sequence of SEQ ID NO:4. 

42. Isolated DNA encoding a codon optimized HIV pol. 

43. Isolated DNA of Claim 42 comprising the nucleotide sequence of SEQ ID 
NO:10. 

44. A method of introducing a DNA sequence of interest into a mammal comprising 
introducing into said mammal a viral accessory protein independent HIV-derived 
retroviral vector particle comprising the DNA sequence of interest. 

45. The method of Claim 44 wherein the mammal is a human. 

46. The method of Claim 44 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

47. A method of introducing a DNA sequence of interest into a mammal comprising 
the steps of: 

a) introducing into cells a viral accessory protein independent HIV-derived 
retroviral vector particle comprising the DNA sequence of interest; and 

b) returning the cells obtained in step a) to the mammal. 

48. The method of Claim 47 wherein the mammal is a human. 



49. The method of Claim 47 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

Rev 

• Regulates HIV gene expression by promoting 
cytoplasmic levels of unspliced and singly spliced 
mRNAs 

• Postulated to affect splicing, stability, transport, 
and translation 

Fig. 4 

Codon Optimization of HIV gagpol 

• Remove A-rich instability elements 

• Improve translational efficiency 

• Reduce risk of recombination with 
transfer vector 

Fig. 5 
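
As a purely illustrative sketch of the kind of synonymous codon replacement summarized in Fig. 5, the following Python fragment swaps each codon for a single preferred synonym, checks that the encoded protein is unchanged, and reports the fraction of A residues before and after. The preferred-codon table (`PREFERRED`), the function names and the short example fragment are assumptions made for illustration only; they are not the codon choices or the mutagenesis procedure of the invention.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred codon per amino acid (an assumption for illustration).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TAA",
}


def codons(seq):
    """Strip whitespace, upper-case, and split a coding sequence into codons."""
    seq = "".join(seq.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def optimize(seq):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in codons(seq))


def a_fraction(seq):
    """Fraction of A residues in the sequence."""
    seq = "".join(seq.split()).upper()
    return seq.count("A") / len(seq)


if __name__ == "__main__":
    # Short example fragment, chosen only for illustration.
    wild_type = "ATG GGT GCG AGA GCG TCG GTA TTA AGC"
    optimized = optimize(wild_type)
    # The protein sequence must be unchanged by synonymous replacement.
    assert translate(optimized) == translate(wild_type)
    print(optimized, a_fraction(wild_type), a_fraction(optimized))
```

A single fixed synonym per amino acid is the simplest possible choice; a practical design would likely also weigh factors such as restriction sites and residual homology with the transfer vector, which this sketch does not attempt.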
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ASgnment Report of Codon cptimizafkm (0a0).ME6, using Oustal method with PAM250 midtie ^ighi tables 



810 



792 MGARASVLSGGELDK »M-3 gesibank.SEQ 
792 ATG GGT GCG ACA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA 
1319 MG ARAS VLSGGZIDK pBDMBgpi&2.8ea 
1319 ATG GGC GCC CGC CCC TCC 6T6 CTG TCC GGC CGC G;VG CTC GAC AA6 _1 * 

p- 1 ^ 

840 870 

^ ^ ^ ^ ^ R ^ G 5 R *K 5 Y Fr»L4-3 oenbanlc.SEQ 
837 TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT AAA 
13M VEKIR LRPGGRKQYK 
1364 TGG GAS MG ATC CGC CTG CGC CCC GGC GGC AAG AAG CVG TAG AAG 

900 

III I^KHIVWASRELBRrA 11X4-3 genbank.SEQ 
882 CTA AAA CAT ATA GTA TGG GCA AGC AGG GAG CTA GAA CGA rrc GCA 
1409 LKBIVWASRELERFA pHI»iEgpa2.sec 
1409 C?G AAC CAC ATC GTG TGG GCC TCC CGC GAG CTG GAG CGC TTC GCC 
I I 
930 960 



927 VHPGLtSTSEGCRQi N14-3 genbank.SEQ 

927 GTT AAT OCT GGC CTT TTA GAG ACA TCA GAA GGC TGT AGA CAA ATA 
1454 VHPGLLETSE 6CRQ1 pBCMBgpni2.9eq 

.1454 GTG AAC CCC GGC CTG CTG GAG ACC TCC GAG GGC TGC CGC CAG AIC 

990 

372 1.GCLQP5LQTGSES L in.4-3 genbank.SEQ 
972 CTG GGA CAG CTA CAA CCA TCC CTT CAG ACA GGA TCA GAA GAA CTT 
1499 LGQLQPSLQTGSEEL plICMK9pa2 . seq 
1499 CTG GGC CAG CTG CAG CCC TCC CTG CAA ACC CGC TCC GAG GAG CTG 

1020 1050 

1017 aSLTSTIAVL YCVHsj HL4-3 genbank.SEQ 

1017 AGA TCA TTA TAT AAT ACA AIA GCA 6TC CTC TAT TGT t?rs CAT CAA 

IS44 RSLYNTZAVLYCVHQ pKDMHgpa^ . seq 

1544 CGC TCC CTG TAC AAC ACC ATC GCC GTG CTG TAC TGC GTG CAC CA6 

1 

1080 

1062 RroVKDTKBALDKI E ia4-3 geabaiik.SEQ 
1062 AGCAIAGATGTAAAAGACACCAASGAAGCCTTAGarAAGAIAGaC 
1589 RZDVKOTKEALOXZE pBZ)MBgpiii2.seq 
1589 CGC ATC GAC GTG AAiG GAC ACC AAG GAG GCC CTG GAC AAC ATC GAG 

lUO IMO 

' I 1 , 

1107 BEQNKSKKKAQQAAA MM-3 genbank.SEQ 
1107 GAA GAG CAA AAC AAA. AGT AAG AAA AAG GCA CAG CAA CCA GCA GCT 
1634 EEQHK5RKSAQQAAA ptimHgpa2.8eq 
1634 GAGGAGCAGAACAAGTCCAAGAAfiAAGGCC CAG CAG GCC GCC GCC 



Fig. 8& 
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Mlynient Rapoft of Codon opSnrinlion (gagXMEG, 
I 

1170 

1152 DTGHNS 'QVSQKYPXV NL4-3 genbank.SEQ 
1152 6AC ACA GGA AAC AAC AGC CAG CTC AiSC CAA AAT tAC CCT ATA OTG 
1679 DTGKNSQVSQ NYPIV pBt»9Igpin2 . seq 
1579 GAC ACC 6CC AAC AAC TCC.CAG CTC TCC CAG AAC TAC CCC ATC CTG — 

1200 1230 

1197 QNI.QGQMVR9AISPR 10.4-3 geabank.SEQ 
1197 CAG AAC CTC CAC GGG CAA ATC GIA CAT CAG GCC ATA TCA CCT AGA 
1724 QHLQGQMVHQAZ SPR pHDMBgpoi2 . seq 
1724 CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC 

1260 

♦ I 

1242 TLKAJTVKVVECKAPS 1114-2 genbank^SBQ 
1242 ACT TTA AAT GCA TGG GTA AAA GTA GTA GAA GAG AAG GCT TTC AGC 
1769 TLKA'^VKVVEEKAFS pH2KBgpin2 . seq 
1769 ACC CTG AAC GCC TGG GTG AAG GTG GTG GAG GAG AAG GCC TTC TCC 

1 — 1 

1290 ^ 1320 

1297 ?BVI PMFSALSEGAT 1IL4-3 genbanJc.SEQ 
1287 CCA GAA GTA AtA CCC ATG TTTTCAGCATTATCAGAAGGAGCCACC 
1814 PEVI PMFSALSBGAT pHDMBgpnZ.seq 
1814 CCC GAA GTC ATC CCC ATG TTC TCC GCC CTG TCC GAG GGC GCC ACC 

t — . 

1350 

1332 ? Q D LNTMLHTV66RQ NL4-3 geobanJcSEQ 

1332 CCA CAA GAT TTA AAT ACC ATG CTA AAC ACA GTG GGC GGA CAT CAA 

1859 PODI.VT MI.llTVGGao pBCMSgpaZ. Aeq 

1859 CCC CAC GAC CTG AAC ACC ATG CTG AAC ACC GTG GGC GGC CAC CAC 

1 1 

1380 1410 

' ■ — - — • 

1377 AAMdMLKETIKSEA A 1IL4-3 geCba&k.SEQ 
1377 GCA GCC ATG CAA ATG TTA AAA GAG ACC ATC AAT GAG GAA GCT GCA 
1904 AANQMLKETZNEBAA pHimHgpie2 . seq 
1904 GCC GCC ATG CAG ATG CTG AAG GAG ACC AIC AAC GAG GAG GCC GCC 

■■- I * 

1440 

^ \ . . . 

1422 EWDRLHFVHAGPIAP VL4-J genbank.SEQ 
1422 GAA TGG GAT AGA TTG CAT CCA GTG CAT GCA GGG CCT ATT GCA CCA 
1949 EVDRLB PVBAGPIAP pRDMHgpma . seq 
1949 GAG TGG GAC CGC CTG CAC CCC GTG CAC GCC GGC CCC ATC GCC OCC 

..p. ..— ..^ I — I.. ■ , ■■ — — » 

1470 1500 

I : ' 

1467 G QN B BP RG5 DIAGTT 1IL4-3 genbank.SGQ 

1467 GGC CAG ATG AGA GAA CCA AGG GGA AGT GAC ATA GCA GGA ACT ACT 
1994 GQMBEPRG53IAGTT pBEMEgpcnZ . seq 

1994 GGC CAG ATG CGC GAG CCC CGC GGC TCC GAC ATC GCC GGC ACC ACC 



Fig. 8B 
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Afignmonl Report of Codon opb'mization (gag).MEQ. using Clustal malhod iMlh FAM250 residUo weight taUs. 

I 

1530 

1512 STLQEQIGWMTHNPP NI*4-3 genbank.SEft 
1512 ACT ACC CTT CAG GAA CAA ATA GGA TGG AT6 ACA CAT AAT CCA CCT 
2039 STLQBQIGWMTHMPP pBOMBgpinl . seq 
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC 

J ^ — • ' » 

1560 ' IWO 

1557 IPVGEIYRRWIIX'GL HL4-3 genbank.SEQ 

1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA 

2064 IPVGEXYKRWIZ&GL pBDKSgp»2 . seq 

20B4 ATC CCC GT6 GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG 

1620 

1602 MKIVRMYSPTSILDI KL4-3 genbank.SEQ 
1602 MS AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA 
2129 HRXVRMYS PTSILDZ pBCH39piik2 . seq 
2129 AAC AAG ATC GTG CGC ATS TAC.TCC CCC ACC TCC ATC CTG GAC ATC 

1650 1680 

1647 RQGPKEPFRDYVDRF NL4-3 genbank.SEQ 
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TPS GTA GAC CGA TTC 
2174 RQGPKEPFRDTVDRF pH]3Miigpn2 . seq 
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC 



1710 

1692 YKTLRACQASQEVKN KL4-3 gvnbank.SEfi 
1692 TXr AAA ACT C7A AGA GCC GAG CAA GCT TCA CAA &IS GTA AAA AAT 
22X9 YKTLRAEQA5QBVKM pEDM39pn2 . seq 
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC 



1740 1770 

» ' . - - -• 

1737 WMT E T I,LVQ HAH P DC £11,4-3 genbank.SEQ 
1737 TGG ATG ACA GAA ACC TTG TT6 GTC CAA AAT GCG AAC CCA GAT TGT 
2264 WHTBTLLVQNAHPOC pBDMHgpBi2 . seq 
2264 TGG ATG ACC GAGACCCT6CTGGTGCAGAACGCCAACCCCGACTGC 



1800 
I 

1782 KTZLXALGPGATLEE 10.4*3 genbank.SEQ 
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA 
2309 KTX LKAL6 P GAT^rEE pBI]IIBgpin2.seq 
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG 

, — — 1 

1S30 I860 

1827 MM TACQGVGGPGHKA in.4-3 genbank.SEQ 
1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA 
2354 MilTACQGVGGPGHKA pEIMBgpm2. seq 
2354 ATG ATG ACC GCC TGC CACGGCGTGGGCGGCCCCGGCCAC AAG OCC 



Fig. 8C 
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ABgnmM« Report of Codon opSmizatkm (9ag).ME6. using Oustal mothod with PAM250 residua weight table. 

» — - 

1B90 

^ * 

1872 .qvLAEAMSQVTWPAT JfL4-3 genbank.SEQ 
1872 AGA GTT TTQ OCT GAA. GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC 
2399 RV LA eAM 5 Q V T N PAT paDMEgpin2 . se^ 
2399 C6C GTG CTG OCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC 



1920 



19S0 



1917 IKIQKGNFRMCRKTV 

1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT 

2444 IMIOKGMFRHQRXTV 

2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG 



St4-3 genbank.SEC 
pHDMK9pta2 . seq 



1980 



1962 KCFNCGREGBlAKStc »L4-3 genbank.SSQ 

1962 AAG TGT TTC AAT TGT GGC AAA GAA GG6 CAC ATA GCC AAA AAT TGC 

2489 KCFHCGKEGBIAKNC pEDMHgpin2 . s eq 

2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC 



2010 



2040 



2007 RAPRKKGCWKCGKEG HL4-3 genbank.SfiQ 

2007 AGG GCC CCT AGG AAA AAG GGC TGT TGG AAA TGT GGA AAG GAA GGA 

2534 RA PRKKGCWKCGKZG pHDMe9t)in2.seq 

2534 CGC GCC CCC CGC AAG AAG GGC TGC TGG AAG TGC GGC AAG GAG GGC 



1 

2070 



2052 BQHKOCTSRQANFLG i4L4-3 9e{Aank.SEQ 

2052 CAC CAA ATG AAA GAT TGT ACT GAG ASA CAG GCT AAT TTT TTA 6GG 
2579 BQHKDC7SRQAVri.G pBEK39pn2. seq 

2579 CAC CAG ATG AAA GAT TGT ACT SAG AGA CAG GCT AAT TTT TTA GGG 



J 

2100 



213C 



2C97 KIWPSHXGRPGNF-Q NL4-3 genbank.SSQ 

2097 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG 

2624 SIWFS HRG RPGlfFLQ pEDMBgtinZ^aeq 

2624 AAG AZC TGG CCT TCC CAC AAG GGA AGC CCA 6GS AAT TTT CTT CAG 



2160 



2142 SRPEPTAPPEESFRF HL4-3 genbank.SEQ 

2142 AGC AGA CCA GAS CCA ACA GCC CCA CCA GAA GAG AGC T?C ASS TTT 

2669 SRPEPTAPF3ESFRF pBDME9pni2.seq 

2669 AGC ACA CCA GAG CCA ACA GCC CCA CCA GAA SAG AGC TTC AGG TTT 



I 

2190 



I 

2220 



2187 6SETTTPSQXQEPXD HL4-3 genbank.SEQ 

2187 GG6 GAA GAG ACA ACA ACTCCCTCTCAGAAGCAlSGAGCCGAXAGAC 

2714 GEBTTTPSOKQE PIO pRDKH9piB2. seq 

2714 GGC SAA SAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA SAC 



Fig. 8D 
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Aiyment Report of Codoo opSmization (gag).MEG. using dustel metfiod mOi PAM250 residiM weight tablsL 

2250 

nil *""K~i~Z~J~P~r"*I S L R S t F C T" genbanJc.SEQ 
2232 AAG G3W CTG TAT CCT TEA GCT TCC CTC AG3V TCA CTC rrr GGC AGC 
27S9 KELtPLASLRSLFGS pBDMBqpiii2 s€^ 
2759 iUU; SAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CIC irr GCC A6C 



1 

2280 



2B04 D P S 5 <2 



2804 GAC CCC TCG TCA CAA TAA 



■^.^ pRI)MBgpm2 . s eq 



Fig. 8E 
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ABgnmenl Rsport of Codon Optnnaab'on (pol> MEG. 
209 0 

2087 FFREDLAFPQGKARE NL4-3 genbank.SEQ

2087 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA

2085 FFREDLAFPQGKARE pNL4-3.seq

2085 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA

2612 FFREDLAFPQGKARE pHDMHgpm2.seq

2612 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA



2150 

I „ ■ . — ,,, 

2132 FSSEQTRANSPTRRE NL4-3 genbank.SEQ

2132 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2130 FSSEQTRANSPTRRE pNL4-3.seq

2130 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2657 FSSEQTRANSPTRRE pHDMHgpm2.seq

2657 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2180 2210

2177 LQVWGRDNNSLSEAG NL4-3 genbank.SEQ

2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA

2175 LQVWGRDNNSLSEAG pNL4-3.seq

2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA

2702 LQVWGRDNNSLSEAG pHDMHgpm2.seq

2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA



2240 

2222 ADRQGTVSFSFPQIT NL4-3 genbank.SEQ

2222 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT

2220 ADRQGTVSFSFPQIT pNL4-3.seq

2220 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT

2747 ADRQGTVSFSFPQIT pHDMHgpm2.seq

2747 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT

2270 2300

2267 LWQRPLVTIKIGGQL NL4-3 genbank.SEQ

2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA

2265 LWQRPLVTIKIGGQL pNL4-3.seq

2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA

2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq

2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG

2330 

2312 KEALLDTGADDTVLE NL4-3 genbank.SEQ

2312 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA

2310 KEALLDTGADDTVLE pNL4-3.seq

2310 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA

2837 KEALLDTGADDTVLE pHDMHgpm2.seq

2837 AAG GAG GCC CTG CTG GAC ACC GGC GCC GAC GAC ACC GTG CTG GAG



Fig. 9A



2360 



2390 



'2357 M H Z p i 5 i J J Z — ^ r. 

2BB2 EMMLPGBIfKpirMTf- 

2B82 ^^OMC<:rGCCC^CGCTGGI^aXJ^jSiiAlc^t^ PHn«gp«2..e9 



2420 

5~5 ? i K V V 6 . Q r D Q I r 

9*no *w ..^ --.^ . w " W 1. J* 



ZSZ7 IGGPISVRQYBOTi T _ 

2927 ArcGGCGGCTTCMc««srccGcawracs»cc^«rc(ws*|c 



2480 



2447 E r C G H K 



2III ^ ^ ^ ^ ^ ^ G^ aLv G^A I^A G^A CCT ""'^ 9«^*"k.SE0 

2445 ^^^^CT^TG^^aLg^A^AG^A^g^^T^A P^^^^^^e^ 

2972 EICGHKAIGTVLVGP pHDMHgpm2.seq

2972 GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC



_^ 2510 

2492 ""t i V M I I G R N L L T n 



3017 



2537 CTLNFPISPIETVPV NL4-3 genbank.SEQ

2537 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
2535 CTLNFPISPIETVPV pNL4-3.seq
2535 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
3062 CTLNFPISPIETVPV pHDMHgpm2.seq

3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG



3062 



2600 

2582 "k L K P G M G 



A.1.K PGMD6PX V V — o — — ;: — 

2582 AAATT*«6C«G«ArGGArGGCCaMASTTJilcSl*L<4 '^'^ ^'^'^'^ 
2580 K L K P G M D G P K V K O n d 

3107 m6cka«ccccscmgg«:«cccc««(w:aagcastLccc 



I 

2630 



I 

2660 



2627 LTEEKIKALVEICTE

2627 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
2625 LTEEKIKALVEICTE
2625 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
3152 LTEEKIKALVEICTE
3152 CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG



2690 



2672 MEKEGKISKIGPENP
2672 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA

2670 MEKEGKISKIGPENP
2670 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA

3197 MEKEGKISKIGPENP
3197 ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC



2720 



— r — 
2750 



2717 YNTPVFAIKKKDSTK
2717 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
2715 YNTPVFAIKKKDSTK
2715 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
3242 YNTPVFAIKKKDSTK
3242 TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG



2780 



2762 WRKLVDFRELNKRTQ

2762 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
2760 WRKLVDFRELNKRTQ
2760 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
3287 WRKLVDFRELNKRTQ
3287 TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG



2810 



2S4C 



2807 DFWEVQLGIPHPAGL
2807 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
2805 DFWEVQLGIPHPAGL
2805 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
3332 DFWEVQLGIPHPAGL
3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG



NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ

pHDMHgpm2.seq

NL4-3 genbank.SEQ



2870 



2852 KQKKSVTVLDVGDAY
2852 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
2850 KQKKSVTVLDVGDAY
2850 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
3377 KQKKSVTVLDVGDAY
3377 AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC



pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq



I 

2900 



2930 
-J 



2897 FSVPLDKDFRKYTAF

2897 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT

2895 FSVPLDKDFRKYTAF

2895 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT

3422 FSVPLDKDFRKYTAF

3422 TTC TCC GTG CCC CTG GAC AAG GAC TTC CGC AAG TAC ACC GCC TTC



NL4-3 genbank.SEQ
pHDMHgpm2.seq



2960 



2942 TIPSINNETPGIRYQ
2942 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
2940 TIPSINNETPGIRYQ
2940 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
3467 TIPSINNETPGIRYQ
3467 ACC ATC CCC TCC ATC AAC AAC GAG ACC CCC GGC ATC CGC TAC CAG



2990 



3020 



2987 YNVLPQGWKGSPAIF

2987 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
2985 YNVLPQGWKGSPAIF

2985 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
3512 YNVLPQGWKGSPAIF

3512 TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC



3050 



3032 QCSMTKILEPFRKQN

3032 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3030 QCSMTKILEPFRKQN

3030 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3557 QCSMTKILEPFRKQN

3557 CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC



3080 3110

3077 PDIVIYQYMDDLYVG
3077 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA

3075 PDIVIYQYMDDLYVG
3075 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA

3602 PDIVIYQYMDDLYVG
3602 CCC GAC ATC GTG ATC TAC CAG TAC ATG GAC GAC CTG TAC GTG GGC

3122



NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq



— r — 

3140 
I 



SDLEIGQHRTKIEEL NL4-3 genbank.SEQ
3122 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3120 SDLEIGQHRTKIEEL pNL4-3.seq
3120 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3647 SDLEIGQHRTKIEEL pHDMHgpm2.seq
3647 TCC GAC CTG GAG ATC GGC CAG CAC CGC ACC AAG ATC GAG GAG CTG



Fig. 9D



3170



3200 

-JL_ 



3167 RQHLLRWGFTTPDKK

3167 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA

3165 RQHLLRWGFTTPDKK

3165 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA

3692 RQHLLRWGFTTPDKK

3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG



pNL4-3.seq
pHDMHgpm2.seq



3230 



3212 HQKEPPFLWMGYELH
3212 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT

3210 HQKEPPFLWMGYELH
3210 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT

3737 HQKEPPFLWMGYELH
3737 CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC



3260 3290

3257 PDKWTVQPIVLPEKD
3257 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC

3255 PDKWTVQPIVLPEKD
3255 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC

3782 PDKWTVQPIVLPEKD
3782 CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC



1 

3320 



3302 SWTVNDIQKLVGKLN
3302 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT

3300 SWTVNDIQKLVGKLN
3300 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT

3827 SWTVNDIQKLVGKLN
3827 TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC



NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq



1 

3350 



J 

3380 



3347 WASQIYAGIKVRQLC NL4-3 genbank.SEQ
3347 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT

3345 WASQIYAGIKVRQLC pNL4-3.seq
3345 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT

3872 WASQIYAGIKVRQLC pHDMHgpm2.seq
3872 TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC



3410 



3392 KLLRGTKALTEVVPL

3392 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3390 KLLRGTKALTEVVPL
3390 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3917 KLLRGTKALTEVVPL
3917 AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG



NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq



3440 



I 

3470 



3437 TEEAELELAENREIL
3437 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA
3435 TEEAELELAENREIL
3435 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA
3962 TEEAELELAENREIL
3962 ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG



3500 



3482 KEPVHGVYYDPSKDL
3482 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA
3480 KEPVHGVYYDPSKDL
3480 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA
4007 KEPVHGVYYDPSKDL
4007 AAG GAG CCC GTG CAC GGC GTG TAC TAC GAC CCC TCC AAG GAC CTG



—I 

3530 



I 

3560 



3527 IAEIQKQGQGQWTYQ
3527 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
3525 IAEIQKQGQGQWTYQ
3525 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
4052 IAEIQKQGQGQWTYQ
4052 ATC GCC GAG ATC CAG AAG CAG GGC CAG GGC CAG TGG ACC TAC CAG



3590 



3572 IYQEPFKNLKTGKYA
3572 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
3570 IYQEPFKNLKTGKYA
3570 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
4097 IYQEPFKNLKTGKYA
4097 ATC TAC CAG GAG CCC TTC AAG AAC CTG AAG ACC GGC AAA TAC GCC



NL4-3 genbank.SEQ
pNL4-3.seq

NL4-3 genbank.SEQ
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq



1 

3620 



3650 



3617 RMKGAHTNDVKQLTE NL4-3 genbank.SEQ

3617 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
3615 RMKGAHTNDVKQLTE pNL4-3.seq

3615 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
4142 RMKGAHTNDVKQLTE pHDMHgpm2.seq

4142 CGC ATG AAG GGC GCC CAC ACC AAC GAC GTG AAG CAG CTG ACC GAG



3680 



3662 AVQKIATESIVIWGK NL4-3 genbank.SEQ

3662 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG

3660 AVQKIATESIVIWGK pNL4-3.seq

3660 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG

4187 AVQKIATESIVIWGK pHDMHgpm2.seq

4187 GCC GTG CAG AAG ATC GCC ACC GAG TCC ATC GTG ATC TGG GGC AAG



Fig. 9F 

3710 3740 

3707 TPKFKLPIQKETWEA NL4-3 genbank.SEQ

3707 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA

3705 TPKFKLPIQKETWEA pNL4-3.seq

3705 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA

4232 TPKFKLPIQKETWEA pHDMHgpm2.seq

4232 ACT CCC AAG TTC AAG CTG CCC ATC CAG AAG GAG ACC TGG GAG GCC

3770 



3752 WWTEYWQATWIPEWE NL4-3 genbank.SEQ

3752 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG

3750 WWTEYWQATWIPEWE pNL4-3.seq

3750 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG

4277 WWTEYWQATWIPEWE pHDMHgpm2.seq

4277 TGG TGG ACC GAG TAC TGG CAG GCC ACC TGG ATC CCC GAG TGG GAG

3800 3830

pHDMHgpm2.seq



3797 FVNTPPLVKLWYQLE NL4-3 genbank.SEQ

3797 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG

3795 FVNTPPLVKLWYQLE pNL4-3.seq

3795 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG

4322 FVNTPPLVKLWYQLE pHDMHgpm2.seq

4322 TTC GTG AAC ACC CCC CCC CTG GTG AAG CTG TGG TAC CAG CTG GAG

3860

3842 KEPIIGAETFYVDGA NL4-3 genbank.SEQ

3842 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
3840 KEPIIGAETFYVDGA pNL4-3.seq

3840 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
4367 KEPIIGAETFYVDGA

4367 AAG GAG CCC ATC ATC GGC GCC GAG ACC TTC TAC GTG GAC GGC GCC

3890 3920

3887 ANRETKLGKAGYVTD NL4-3 genbank.SEQ

3887 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC

3885 ANRETKLGKAGYVTD pNL4-3.seq

3885 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC

4412 ANRETKLGKAGYVTD pHDMHgpm2.seq

4412 GCC AAC CGC GAG ACC AAG CTG GGC AAG GCC GGC TAC GTG ACC GAC

3950 

3932 RGRQKVVPLTDTTNQ NL4-3 genbank.SEQ

3932 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG

3930 RGRQKVVPLTDTTNQ pNL4-3.seq

3930 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG

4457 RGRQKVVPLTDTTNQ pHDMHgpm2.seq

4457 CGC GGC CGC CAG AAG GTG GTG CCC CTG ACC GAC ACC ACC AAC CAG



3980 4010

3977 KTELQAIHLALQDSG NL4-3 genbank.SEQ
3977 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA

3975 KTELQAIHLALQDSG pNL4-3.seq
3975 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA

4502 KTELQAIHLALQDSG pHDMHgpm2.seq
4502 AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC



4040 



4022 LEVNIVTDSQYALGI
4022 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC

4020 LEVNIVTDSQYALGI
4020 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC

4547 LEVNIVTDSQYALGI
4547 CTG GAG GTG AAC ATC GTG ACC GAC TCC CAG TAT GCA TTG GGC ATC



4070 4100

4067 IQAQPDKSESELVSQ
4067 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA

4065 IQAQPDKSESELVSQ
4065 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA

4592 IQAQPDKSESELVSQ
4592 ATC CAG GCC CAG CCC GAC AAG TCC GAG TCC GAG CTG GTG TCC CAG



4130

4112 IIEQLIKKEKVYLAW
4112 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG

4110 IIEQLIKKEKVYLAW
4110 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG

4637 IIEQLIKKEKVYLAW
4637 ATC ATC GAG CAG CTG ATC AAG AAG GAG AAG GTG TAC CTG GCC TGG



1 

4160 



4190 



4157 VPAHKGIGGNEQVDG

4157 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT GGG

4155 VPAHKGIGGNEQVDK

4155 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT AAG

4682 VPAHKGIGGNEQVDK

4682 GTG CCC GCC CAC AAG GGC ATC GGC GGC AAC GAG CAG GTG GAC AAG



NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq



I 

4220 



4202 LVSAGIRKVLFLDGI NL4-3 genbank.SEQ

4202 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA

4200 LVSAGIRKVLFLDGI pNL4-3.seq

4200 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA

4727 LVSAGIRKVLFLDGI pHDMHgpm2.seq

4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC



>' ' ' 
4250 



4280 



4247 DKAQEEHEKYHSNWR

4247 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA
4245 DKAQEEHEKYHSNWR

4245 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA
4772 DKAQEEHEKYHSNWR

4772 GAC AAG GCC CAG GAG GAG CAC GAG AAG TAC CAC TCC AAC TGG CGC



4310 
t 



4292 AMASDFNLPPVVAKE

4292 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA

4290 AMASDFNLPPVVAKE

4290 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA
4817 AMASDFNLPPVVAKE

4817 GCC ATG GCC TCC GAC TTC AAC CTG CCC CCC GTG GTG GCC AAG GAG



4340 



1 

4370 



4337 IVASCDKCQLKGEAM

4337 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG

4335 IVASCDKCQLKGEAM

4335 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG

4862 IVASCDKCQLKGEAM

4862 ATC GTG GCC TCC TGC GAC AAG TGC CAG CTG AAG GGC GAG GCC ATG



1 

4400 



4382 HGQVDCSPGIWQLDC

4382 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT

4380 HGQVDCSPGIWQLDC

4380 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT

4907 HGQVDCSPGIWQLDC

4907 CAC GGC CAG GTG GAC TGC TCC CCC GGC ATC TGG CAG CTG GAC TGC



4430 



4460 



4427 THLEGKVILVAVHVA

4427 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC

4425 THLEGKVILVAVHVA

4425 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC

4952 THLEGKVILVAVHVA

4952 ACC CAC CTG GAG GGC AAG GTG ATC CTG GTG GCC GTG CAC GTG GCC



NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq

NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq

NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq



4490 

L- 



4472 SGYIEAEVIPAETGQ NL4-3 genbank.SEQ

4472 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA

4470 SGYIEAEVIPAETGQ pNL4-3.seq

4470 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA

4997 SGYIEAEVIPAETGQ pHDMHgpm2.seq

4997 TCC GGC TAC ATC GAG GCC GAG GTG ATC CCC GCC GAG ACC GGC CAG



4520 4550

4517 ETAYFLLKLAGRWPV
4517 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA

4515 ETAYFLLKLAGRWPV
4515 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA

5042 ETAYFLLKLAGRWPV
5042 GAG ACC GCC TAC TTC CTG CTG AAG CTG GCC GGC CGC TGG CCC GTG



4580 



4562 KTVHTDNGSNFTSTT
4562 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
4560 KTVHTDNGSNFTSTT
4560 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
5087 KTVHTDNGSNFTSTT
5087 AAG ACC GTG CAC ACC GAC AAC GGC TCC AAC TTC ACC TCC ACC ACC



4610 



4640

4607 VKAACWWAGIKQEFG
4607 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC

4605 VKAACWWAGIKQEFG
4605 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC

5132 VKAACWWAGIKQEFG
5132 GTG AAG GCC GCC TGC TGG TGG GCC GGC ATC AAG CAG GAG TTC GGC



4670

4652 IPYNPQSQGVIESMN
4652 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT

4650 IPYNPQSQGVIESMN
4650 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT

5177 IPYNPQSQGVIESMN
5177 ATC CCC TAC AAC CCC CAG TCC CAG GGC GTG ATC GAG TCC ATG AAC



4700 



4730 



4697 KELKKIIGQVRDQAE
4697 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA

4695 KELKKIIGQVRDQAE
4695 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA

5222 KELKKIIGQVRDQAE
5222 AAG GAG CTG AAG AAG ATC ATC GGC CAA GTC CGC GAC CAG GCC GAG



NL4-3 genbank.SEQ

pNL4-3.seq
pHDMHgpm2.seq

pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ

pNL4-3.seq

pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq

pHDMHgpm2.seq



1 — - 

4760 



4742 HLKTAVQMAVFIHNF NL4-3 genbank.SEQ

4742 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT

4740 HLKTAVQMAVFIHNF pNL4-3.seq

4740 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT

5267 HLKTAVQMAVFIHNF pHDMHgpm2.seq

5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC



I 

4790 



I 

4B20 



4787 KRKGGIGGYSAGERI

4787 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
4785 KRKGGIGGYSAGERI
4785 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
5312 KRKGGIGGYSAGERI
5312 AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC



NL4-3 genbank.SEQ

pNL4-3.seq
pHDMHgpm2.seq



4850



4832 VDIIATDIQTKELQK NL4-3 genbank.SEQ

4832 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA

4830 VDIIATDIQTKELQK pNL4-3.seq

4830 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA

5357 GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG pHDMHgpm2.seq



4880 



4910 



4877 QITKIQNFRVYYRDS

4877 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
4875 QITKIQNFRVYYRDS

4875 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC

5402 QITKIQNFRVYYRDS

5402 CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC



4940 



4922 RDPVWKGPAKLLWKG

4922 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT

4920 RDPVWKGPAKLLWKG

4920 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT

5447 RDPVWKGPAKLLWKG

5447 CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC



4970 



5000 



4967 EGAVVIQDNSDIKVV

4967 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
4965 EGAVVIQDNSDIKVV

4965 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
5492 EGAVVIQDNSDIKVV

5492 GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG

5030

5012 PRRKAKIIRDYGKQM

5012 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5010 PRRKAKIIRDYGKQM
5010 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5537 PRRKAKIIRDYGKQM
5537 CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG



NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq

NL4-3 genbank.SEQ
pNL4-3.seq
pHDMHgpm2.seq



5060 5090

5057 AGDDCVASRQDED NL4-3 genbank.SEQ
5057 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5055 AGDDCVASRQDED pNL4-3.seq
5055 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5582 AGDDCVASRQDED pHDMHgpm2.seq
5582 GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA
AGCTTGGCCC ATTGCATACG TTGTATCCAT ATCATAATAT GTACATTTAT ATTGGCTCAT 60

GTCCAACATT ACCGCCATGT TGACATTGAT TATTGACTAG TTATTAATAG TAATCAATTA 120

CGGGGTCATT AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG 180

GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC 240

CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA 300 

CTGCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA 360

ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA 420

CTTGGCAGTA CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT 480

ACATCAATGG GCGTGGATAG CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG 540

ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA 600

ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC TATATAAGCA 660

GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC 720 

ATAGAAGACA CCGGGACCGA TCCAGCCTCC CCTCGAAGCT GATCCTGAGA ACTTCAGGGT 780 

GAGTCTATGG GACCCTTGAT GTTTTCTTTC CCCTTCTTTT CTATGGTTAA GTTCATGTCA 840

TAGGAAGGGG AGAAGTAACA GGGTACACAT ATTGACCAAA TCAGGGTAAT TTTGCATTTG 900 

TAATTTTAAA AAATGCTTTC TTCTTTTAAT ATACTTTTTT GTTTATCTTA TTTCTAATAC 960

TTTCCCTAAT CTCTTTCTTT CAGGGCAATA ATGATACAAT GTATCATGCC TCTTTGCACC 1020

ATTCTAAAGA ATAACAGTGA TAATTTCTGG GTTAAGGCAA TAGCAATATT TCTGCATATA 1080

AATATTTCTG CATATAAATT GTAACTGATG TAAGAGGTTT CATATTGCTA ATAGCAGCTA 1140

CAATCCAGCT ACCATTCTGC TTTTATTTTA TGGTTGGGAT AAGGCTGGAT TATTCTGAGT 1200

CCAAGCTAGG CCCTTTTGCT AATCATGTTC ATACCTCTTA TCTTCCTCCC ACAGCTCCTG 1260 

GGCAACGTGC TGGTCTGTGT GCTGGCCCAT CACTTTGGCA AAGAATTCTA GACTGCCATG 1320 

GGCGCCCGCG CCTCCGTGCT GTCCGGCGGC GAGCTGGACA AGTGGGAGAA GATCCGCCTG 1380

CGCCCCGGCG GCAAGAAGCA GTACAAGCTG AAGCACATCG TGTGGGCCTC CCGCGAGCTG 1440 

GAGCGCTTCG CCGTGAACCC CGGCCTGCTG GAGACCTCCG AGGGCTGCCG CCAGATCCTG 1500 

GGCCAGCTGC AGCCCTCCCT GCAAACCGGC TCCGAGGAGC TGCGCTCCCT GTACAACACC 1560

ATCGCCGTGC TGTACTGCGT GCACCAGCGC ATCGACGTGA AGGACACCAA GGAGGCCCTG 1620

GACAAGATCG AGGAGGAGCA GAACAAGTCC AAGAAGAAGG CCCAGCAGGC CGCCGCCGAC 1680

ACCGGCAACA ACTCCCAGGT GTCCCAGAAC TACCCCATCG TGCAGAACCT GCAGGGCCAG 1740

ATGGTGCACC AGGCCATCTC CCCCCGCACC CTGAACGCCT GGGTGAAGGT GGTGGAGGAG 1800 

AAGGCCTTCT CCCCCGAAGT CATCCCCATG TTCTCCGCCC TCTCCGAGGG CGCCACCCCC 1860

CAGGACCTGA ACACCATGCT GAACACCGTG GGCGGCCACC AGGCCGCCAT GCAGATGCTG 1920

AAGGAGACCA TCAACGAGGA GGCCGCCGAG TGGGACCGCC TGCACCCCGT GCACGCCGGC 1980

CCCATCGCCC CCGGCCAGAT GCGCGAGCCC CGCGGCTCCG ACATCGCCGG CACCACCTCC 2040

ACCCTGCAAG AGCAGATCGG CTGGATGACC CACAACCCCC CCATCCCCGT GGGCGAGATC 2100

TACAAGCGCT GGATCATCCT GGGCCTGAAC AAGATCGTGC GCATGTACTC CCCCACCTCC 2160

ATCCTGGACA TCCGCCAGGG CCCCAAGGAG CCCTTCCGCG ACTACGTGGA CCGCTTCTAC 2220 

AAGACCCTGC GCGCCGAGCA GGCCTCCCAG GAGGTAAAGA ACTGGATGAC CGAGACCCTG 2280 

CTGGTGCAGA ACCCCAACCC CGACTGCAAG ACCATCCTGA AGGCCCTGGG CCCCGGCGCC 2340

ACCCTGGAGG AGATGATGAC CGCCTGCCAG GGCGTGGGCG GCCCCGGCCA CAAGGCCCGC 2400 

GTGCTGGCCG AGGCCATGTC CCAAGTCACC AACCCCGCCA CCATCATGAT CCAGAAGGGC 2460

AACTTCCGCA ACCAGCGCAA GACCGTGAAG TGCTTCAACT GCGGCAAGGA GGGCCACATC 2520

GCCAAGAACT GCCGCGCCCC CCGCAAGAAG GGCTGCTGGA AGTGCGGCAA GGAGGGCCAC 2580

CAGATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG GGAAGATCTG GCCTTCCCAC 2640

AAGGGAAGGC CAGGGAATTT TCTTCAGAGC AGACCAGAGC CAACAGCCCC ACCAGAAGAG 2700

AGCTTCAGGT TTGGGGAAGA GACAACAACT CCCTCTCAGA AGCAGGAGCC GATAGACAAG 2760

GAACTGTATC CTTTAGCTTC CCTCAGATCA CTCTTTGGCA GCGACCCCTC GTCACAATAA 2820
AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880

CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940

TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000 

CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060 

GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120 

GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180

TGGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240

CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300

TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360

CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420

ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480 

TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540

GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600

ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660

TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720 

CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780

ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840

ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900

AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960

TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020

ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080

GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140

CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200 

TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260

AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320

AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380

TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440

CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAC ACCACCAACC 4500 

AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560 

TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620

CCGAGCTGGT GTCCCAGATC ATCGAGCAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680

GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740

GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800

ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860

AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920

ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980

TGGCCGTGCA CGTGGCCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040

AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100 

CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160 

TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220 

ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280

CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340

CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CCAGACCAAG GAGCTGCAGA 5400

AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460

GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520

CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATCAT CCGCGACTAC GGCAAGCAGA 5580

TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAAGATTA 5640
GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700

CCXAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760 

CTAATGCCCT GGCCCT^CAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820 

AGGrrCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 58 BO 

CATCTGGATT CTGCCT A ATA AAAAACATTT ATTTrCATTG CAATGATGTA TTEAAATTAT 5940 

TTCTGAATAT TTTACTAAAA AGGGAAIGTG 6GAGGTCAGT GCATTTAAAA CATAAAGAAA 6000 

TGAAGftGCTA GTTCAAACCT TG6GAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060' 

GAG6CZGCAA ACAqCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120 

CCTCAGAAAA GGATTCAAGT A6AGGCTTGA TTTG6AGGTT AAAGTTTT6C TATGCTGTAT 6180 

TTTACArrAC TTATTSTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240 

TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300 

CrcAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360 

TTAGTTGTCT d'OTfCiTCTT ATA6AGGTCT ACTTGAAGAA GG&AAAACAG GGGGCATGGT 6420 

TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCX»CT CACACTGACC CGGAATCCCT 6480 

CGACATGGCA GTCTAGATCA TTCTT6AAGA C6AAAGG6CC 7C6TGAXAC6 CCTATTTTTA 6540 

TAGQTTAA3G TCAT6ATAAT AATGGTTTCT TAGAC6TCAG 6T6GCACTTT TC6GGGAAAT 6600 

GltSCGCGGAA CCCCl'ATTTG m ' A'm 'TTC TAAATACATT CAAATATGTA TCCGCTCATG 6660 

AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTA7 GAGTATTCAA 6720 

CATTTCCCTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780 

CTAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840 

ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900 

CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960 

GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020 

CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA <^GAATTA7G CAGTGCTGCC 7080 

ATAACCAT6A GTGATAACAC T6CGGCCAAC TTACTPCTGA CAACGATCGG AGGACCGAA6 7140 

6A6CTAACC6 CTTTTTTGCA CAACATG6G6 GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200 

CCGGA6CTGA ATGAAGCCAT ACCAAACGAC GAGC6T6ACA CCACGATGCC TGTAGCAATO 7260 

GCAACAAC6T TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320 

TTAATA6ACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380 

CCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGA6C GTGGCTCTCG CGGTATCATT 7440 

GCAGCAC7GG GGCCAGATGG TAAOCCCTCC CGTAIC61A6 TTATCTACAC 6ACG6GGAGT 7500 

CAGGCAACTA TGGATGAAC6 AAATAGACAG ATCGCTGA6A TASG TCCC T C ACTGATTAA6 7560 

CATTG6TAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATT GA TTT AAAACTTCAT 7620 

TTTTAATTTA AAAGGATCTA 6GTGAA6ATC CTTTTT 6A TA ATCTCATGAC CAAAATCCCT 7680 

TAACCT6ACT TTXGTTCCA CTGAGCGTCA GACCCCCSTAG AAAAGATCAA AGGATCTTCT 7740 

TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800 

GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860 

AGCAGAGC6C AGAXACCAAA TACTGTTCTT CTAGTGTAGC C6TAGTTA6G CCACCACTTC 7920 

AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980 

GCCAGTOGC6 ATAAGTCGT6 TCTTACCGOC TTGGACXCAA GACGAlAGr? ACCGGATAA6 8040 

6CGCABCGGT CGGGCTGAAC GGGGG G TT CO TGCACACAGC CCAGCTT6GA GC6AACGACC 8100 

TACACCGAAC TGAGAiZACCT ACAGCGTGAS CTATGAGAAA GC6CCACGCT TCCCGTUIGGG 8160 

A6AAAGGC0G ACAOGTATCC GGTAAGC6GC AG6GTCGGAA CAGGAGA6C6 CACGAGGGAS 8220 

CTTCCAGGGG GAAACGCCT6 GTATCTTTAT AGTCCTGTCG GGTTTCQCCA CCTCTGACTT 8280 

GAGCGTCGAT ITI T G T GA TG CTCGTCAGGG GGGCGGAGCC TATG6AAAAA CGCCAGCAAC 8340 

GGArGCGCCG CGTGCG6CTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400 

TGG7TTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATZCTT GGAGTGGTGA 8460 
ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520

GCGACGCAAC GCGG06AGGC AGACAAGGTA TAG60CG6C6 CCTACAATCC ATGCCAACCC 8580 

GTTCCATGTG CTCGCC6AGG CGGCATAAA7 CCCCGTGAC6 ATCA6CGGTC CAATGATCI3A 8640 

AGITAGGCTG GTAAGAGCCG CSAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700 

CTGCCTGGAC AGCATGGCCT GCAACGCGG6 CATCCCGATG CCGCCGGAA6 CGAGAAGAAT 8760 

CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCXAGGCCTC 8020 

CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCASaSGC CGAG6C6GCC TCGQCCTCTG 8880_ 
CATAAATAAA AAAAATTAGT CAGCCATG 8908 
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